


Creators/Authors contains: "Saebi, Mandana"


  1. The lack of publicly available, large, and unbiased datasets is a key bottleneck for the application of machine learning (ML) methods in synthetic chemistry. Data from electronic laboratory notebooks (ELNs) could provide less biased, large datasets, but no such datasets have been made publicly available. The first real-world dataset from the ELNs of a large pharmaceutical company is disclosed and its relationship to high-throughput experimentation (HTE) datasets is described. For chemical yield predictions, a key task in chemical synthesis, an attributed graph neural network (AGNN) performs as well as or better than the best previous models on two HTE datasets for the Suzuki–Miyaura and Buchwald–Hartwig reactions. However, training the AGNN on an ELN dataset does not lead to a predictive model. The implications of using ELN data for training ML-based models are discussed in the context of yield predictions. 
  2. Complex systems, represented as dynamic networks, consist of components that influence each other via direct and/or indirect interactions. Recent research has shown the importance of using Higher-Order Networks (HONs) for modeling and analyzing such complex systems, as the typical Markovian assumption in developing the First Order Network (FON) can be limiting. This higher-order network representation not only creates a more accurate representation of the underlying complex system, but also leads to more accurate network analysis. In this paper, we first present a scalable and accurate model, BuildHON+, for higher-order network representation of data derived from a complex system with various orders of dependencies. Then, we show that this higher-order network representation modeled by BuildHON+ is significantly more accurate in identifying anomalies than FON, demonstrating a need for the higher-order network representatio 
  3. null (Ed.)
  4. Abstract

    Rapid climate change has wide-ranging implications for the Arctic region, including sea ice loss, increased geopolitical attention, and expanding economic activity resulting in a dramatic increase in shipping activity. As a result, the risk of harmful non-native marine species being introduced into this critical region will increase unless policy and management steps are implemented in response. Using data about shipping, ecoregions, and environmental conditions, we leverage network analysis and data mining techniques to assess, visualize, and project ballast water-mediated species introductions into the Arctic and dispersal of non-native species within the Arctic. We first identify high-risk connections between the Arctic and non-Arctic ports that could be sources of non-native species over 15 years (1997–2012) and observe the emergence of shipping hubs in the Arctic where the cumulative risk of non-native species introduction is increasing. We then consider how environmental conditions can constrain this Arctic introduction network for species with different physiological limits, thus providing a tool that will allow decision-makers to evaluate the relative risk of different shipping routes. Next, we focus on within-Arctic ballast-mediated species dispersal where we use higher-order network analysis to identify critical shipping routes that may facilitate species dispersal within the Arctic. The risk assessment and projection framework we propose could inform risk-based assessment and management of ship-borne invasive species in the Arctic.

     
  5. Abstract

    The spread of nonindigenous species by shipping is a large and growing global problem that harms coastal ecosystems and economies and may blur coastal biogeographical patterns. This study coupled eukaryotic environmental DNA (eDNA) metabarcoding with dissimilarity regression to test the hypothesis that ship‐borne species spread homogenizes port communities. We first collected and metabarcoded water samples from ports in Europe, Asia, Australia and the Americas. We then calculated community dissimilarities between port pairs and tested for effects of environmental dissimilarity, biogeographical region and four alternative measures of ship‐borne species transport risk. We predicted that higher shipping between ports would decrease community dissimilarity, that the effect of shipping would be small compared to that of environment dissimilarity and shared biogeography, and that more complex shipping risk metrics (which account for ballast water and stepping‐stone spread) would perform better. Consistent with our hypotheses, community dissimilarities increased significantly with environmental dissimilarity and, to a lesser extent, decreased with ship‐borne species transport risks, particularly if the ports had similar environments and stepping‐stone risks were considered. Unexpectedly, we found no clear effect of shared biogeography, and that risk metrics incorporating estimates of ballast discharge did not offer more explanatory power than simpler traffic‐based risks. Overall, we found that shipping homogenizes eukaryotic communities between ports in predictable ways, which could inform improvements in invasive species policy and management. We demonstrated the usefulness of eDNA metabarcoding and dissimilarity regression for disentangling the drivers of large‐scale biodiversity patterns. We conclude by outlining logistical considerations and recommendations for future studies using this approach.

     